Exploiting Similarity for Multi-Source Downloads Using File Handprints

Authors

  • Himabindu Pucha
  • David G. Andersen
  • Michael Kaminsky
Abstract

Many contemporary approaches for speeding up large file transfers attempt to download chunks of a data object from multiple sources. Systems such as BitTorrent quickly locate sources that have an exact copy of the desired object, but they are unable to use sources that serve similar but non-identical objects. Other systems automatically exploit cross-file similarity by identifying sources for each chunk of the object. These systems, however, require a number of lookups proportional to the number of chunks in the object and a mapping for each unique chunk in every identical and similar object to its corresponding sources. Thus, the lookups and mappings in such a system can be quite large, limiting its scalability. This paper presents a hybrid system that provides the best of both approaches, locating identical and similar sources for data objects using a constant number of lookups and inserting a constant number of mappings per object. We first demonstrate through extensive data analysis that similarity does exist among objects of popular file types, and that making use of it can sometimes substantially improve download times. Next, we describe handprinting, a technique that allows clients to locate similar sources using a constant number of lookups and mappings. Finally, we describe the design, implementation and evaluation of Similarity-Enhanced Transfer (SET), a system that uses this technique to download objects. Our experimental evaluation shows that by using sources of similar objects, SET is able to significantly out-perform an equivalently configured BitTorrent.
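The abstract names handprinting but does not spell out how it works, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes a handprint is the k smallest chunk hashes of an object, so that two objects sharing many chunks are likely to share at least one handprint entry. Fixed-size chunking, the values of CHUNK_SIZE and K, and the in-memory Directory class (a stand-in for whatever lookup service maps hashes to sources) are all hypothetical choices made for illustration.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 16 * 1024  # illustrative value, not taken from the abstract
K = 30                  # illustrative handprint size, not taken from the abstract


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks and hash each one (assumed chunking)."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def handprint(data: bytes, k: int = K, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Return the k smallest distinct chunk hashes (assumed handprint definition)."""
    return sorted(set(chunk_hashes(data, chunk_size)))[:k]


class Directory:
    """Hypothetical in-memory lookup service: chunk hash -> set of source names."""

    def __init__(self) -> None:
        self._sources: dict[bytes, set[str]] = {}

    def publish(self, source: str, hashes: list[bytes]) -> None:
        # A source inserts at most k mappings per object, however large it is.
        for h in hashes:
            self._sources.setdefault(h, set()).add(source)

    def lookup(self, hashes: list[bytes]) -> set[str]:
        # A downloader performs at most k lookups per object.
        found: set[str] = set()
        for h in hashes:
            found |= self._sources.get(h, set())
        return found
```

In this sketch, a source calls publish with the handprint of each object it serves, and a downloader calls lookup with the handprint of the object it wants. Both costs are bounded by k and do not grow with the number of chunks, which is the constant-lookup, constant-mapping property the abstract describes. Sources found this way may hold only some of the desired chunks, so a real client would still need to learn which chunks each source can supply.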

Similar Resources

Accelerating File Downloads in Publish Subscribe Internetworking with Multisource and Multipath Transfers

We present mmFTP, a file transfer protocol for the Publish Subscribe Internetworking (PSI) architecture, which follows the Information-Centric Network (ICN) paradigm. mmFTP is designed to utilize diverse in-network resources: (i) it is receiver-driven, thus supporting on-path caching, (ii) it downloads files from multiple sources, thus utilizing off-path caching and (iii) it offers multipath tr...

A New Method for Measuring Parameters of Transuranium Alpha-Emitting Sources

An alpha-gamma coincidence method, using a multi-parameter analyzer (MPA) system in list mode, was used to measure parameters of transuranium sources. The benefit of list mode is that it avoids repeating experiments, which saves the cost and time of this type of measurement; such measurements take a long time because of the very low sample rate. The multi-parameter analyzer stores the ...

Cell-Level Modeling of IEEE 802.11 WLANs

We develop a scalable cell-level analytical model for multi-cell infrastructure IEEE 802.11 WLANs under a so-called Pairwise Binary Dependence (PBD) condition. The PBD condition is a geometric property under which the relative locations of the nodes inside a cell do not matter and the network is free of hidden nodes. For the cases of saturated nodes and TCP-controlled long-file downloads, we pr...

VoIP Call Performance Over IPv6 During HTTP and Bittorrent Downloads

We study the performance of a VoIP call in an IPv6 network during download of a large file using HTTP, BitTorrent, or uTP, and simultaneous downloads using HTTP and BitTorrent or HTTP and uTP. Performance metrics include maximum delta, maximum and mean jitter, throughput, packet loss, and perceived voice quality. The results indicate that the values of these metrics are stable and that call qua...

Measuring and Detecting Malware Downloads in Live Network Traffic

In this paper, we present AMICO, a novel system for measuring and detecting malware downloads in live web traffic. AMICO learns to distinguish between malware and benign file downloads from the download behavior of the network users themselves. Given a labeled dataset of past benign and malware file downloads, AMICO learns a provenance classifier that can accurately detect future malware downlo...

Publication date: 2007